Solving Relational MDPs with Exogenous Events and Additive Rewards
Abstract
We formalize a simple but natural subclass of service domains for relational planning problems with object-centered, independent exogenous events and additive rewards, capturing, for example, problems in inventory control. Focusing on this subclass, we present a new symbolic planning algorithm, the first with explicit performance guarantees for relational MDPs with exogenous events. In particular, under some technical conditions, our planning algorithm provides a monotonic lower bound on the optimal value function. To support this algorithm we present novel evaluation and reduction techniques for generalized first-order decision diagrams, a knowledge representation for real-valued functions over relational world states. Our planning algorithm uses a set of focus states, which serves as a training set, to simplify and approximate the symbolic solution, and can thus be seen to perform learning for planning. A preliminary experimental evaluation demonstrates the validity of our approach.
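To make the setting concrete, the following minimal Python sketch (an illustration of the problem class, not of the paper's GFODD-based algorithm) models a service domain in the inventory-control style: each object carries its own local state, exogenous events fire independently per object, and the reward is a sum over objects. All names (Shop, the consumption probability, the restock action) are illustrative assumptions.

```python
import random

class Shop:
    """One object in an inventory-control instance: an inventory level 0..MAX."""
    MAX = 3
    def __init__(self, level=0):
        self.level = level

def exogenous_step(shops, p_consume=0.4):
    """Object-centered, independent exogenous events: each shop's inventory
    may drop by one, independently of the action and of the other shops."""
    for s in shops:
        if s.level > 0 and random.random() < p_consume:
            s.level -= 1

def reward(shops):
    """Additive reward: a sum of per-object rewards (1 per stocked shop)."""
    return sum(1 for s in shops if s.level > 0)

def step(shops, restock_index):
    """Agent action (restock one shop), then exogenous events, then reward."""
    if restock_index is not None:
        shops[restock_index].level = Shop.MAX
    exogenous_step(shops)
    return reward(shops)

# Simulate a greedy policy that always restocks the emptiest shop.
shops = [Shop(2), Shop(0), Shop(1)]
total = sum(step(shops, min(range(len(shops)), key=lambda i: shops[i].level))
            for _ in range(20))
print("accumulated reward:", total)
```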
Similar papers
Bellman goes Relational (extended abstract)
We introduce ReBel, a relational Bellman update operator that can be used for Markov Decision Processes in (possibly infinite) relational domains. Using ReBel we develop a relational value iteration algorithm. Many reinforcement learning (RL) and dynamic programming techniques have been developed for solving Markov Decision Processes (MDPs). Until recentl...
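For orientation, the sketch below shows the ground (propositional) Bellman backup that ReBel lifts to relational, possibly infinite domains; the explicit state, transition, and reward tables are assumptions for illustration, and are exactly what the relational operator avoids enumerating.

```python
def value_iteration(states, actions, T, R, gamma=0.9, eps=1e-6):
    """Ground value iteration: T[s][a] is a list of (prob, next_state),
    R[s][a] is the immediate reward. Returns the value function V."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            # Bellman backup: best action under a one-step lookahead.
            backup = max(R[s][a] + gamma * sum(p * V[s2] for p, s2 in T[s][a])
                         for a in actions)
            delta = max(delta, abs(backup - V[s]))
            V[s] = backup
        if delta < eps:
            return V

# A two-state toy MDP (hypothetical numbers).
states, actions = ["empty", "stocked"], ["restock", "wait"]
T = {"empty":   {"restock": [(1.0, "stocked")], "wait": [(1.0, "empty")]},
     "stocked": {"restock": [(1.0, "stocked")], "wait": [(0.5, "empty"), (0.5, "stocked")]}}
R = {"empty":   {"restock": 0.0, "wait": 0.0},
     "stocked": {"restock": 1.0, "wait": 1.0}}
print(value_iteration(states, actions, T, R))
```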
Approximate Policy Iteration with a Policy Language Bias: Solving Relational Markov Decision Processes
We study an approach to policy selection for large relational Markov Decision Processes (MDPs). We consider a variant of approximate policy iteration (API) that replaces the usual value-function learning step with a learning step in policy space. This is advantageous in domains where good policies are easier to represent and learn than the corresponding value functions, which is often the case ...
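A schematic sketch of this policy-space variant of API, under an assumed simulator and learner interface (sample_next, reward, and learn are all hypothetical names): each iteration labels a set of training states with their rollout-best action and fits a new policy from those labels, without ever representing a value function.

```python
def rollout_value(state, action, policy, sample_next, reward, gamma=0.9, horizon=10):
    """Estimate Q(state, action) by taking `action` once, then following `policy`."""
    total, g = reward(state, action), gamma
    s = sample_next(state, action)
    for _ in range(horizon):
        a = policy(s)
        total += g * reward(s, a)
        s, g = sample_next(s, a), g * gamma
    return total

def api_iteration(states, actions, policy, learn, sample_next, reward):
    """One API step: label each training state with its rollout-best action,
    then learn an improved policy (a state -> action classifier) in policy space."""
    labels = {s: max(actions,
                     key=lambda a: rollout_value(s, a, policy, sample_next, reward))
              for s in states}
    return learn(labels)

# Toy usage: a chain where "fix" is the right choice in the "broken" state.
states, actions = ["broken", "ok"], ["fix", "wait"]
sample_next = lambda s, a: "ok" if a == "fix" else s
reward = lambda s, a: 1.0 if s == "ok" else 0.0
learn = lambda labels: (lambda s: labels.get(s, "wait"))  # memorizing "learner"
policy = api_iteration(states, actions, lambda s: "wait", learn, sample_next, reward)
print(policy("broken"))  # -> "fix"
```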
Eliciting Additive Reward Functions for Markov Decision Processes
Specifying the reward function of a Markov decision process (MDP) can be demanding, requiring human assessment of the precise quality of, and tradeoffs among, various states and actions. However, reward functions often possess considerable structure which can be leveraged to streamline their specification. We develop new, decision-theoretically sound heuristics for eliciting rewards for factored...
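The kind of structure being leveraged is easy to state concretely: when the reward of a factored state decomposes into a sum of per-attribute subrewards, elicitation only has to pin down the small local tables rather than one value per joint state. The attributes and numbers below are illustrative assumptions, not from the paper.

```python
def additive_reward(state, subrewards):
    """state: dict attribute -> value; subrewards: attribute -> {value: reward}.
    The total reward is the sum of the local (per-attribute) rewards."""
    return sum(subrewards[attr][val] for attr, val in state.items())

# Two attributes with 2 values each: 4 elicited numbers cover all 4 joint states.
subrewards = {
    "fuel":     {"low": -2.0, "high": 0.0},
    "location": {"depot": 1.0, "road": 0.0},
}
print(additive_reward({"fuel": "low", "location": "depot"}, subrewards))  # -1.0
```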
Relational Partially Observable MDPs
Relational Markov Decision Processes (MDPs) are a useful abstraction for stochastic planning problems, since one can develop abstract solutions for them that are independent of domain size or instantiation. While there has been increased interest in developing relational fully observable MDPs, there has been very little work on relational partially observable MDPs (POMDPs), which deal with unce...
Model Reduction Techniques for Computing
We present a method for solving implicit (factored) Markov decision processes (MDPs) with very large state spaces. We introduce a property of state space partitions which we call ε-homogeneity. Intuitively, an ε-homogeneous partition groups together states that behave approximately the same under all or some subset of policies. Borrowing from recent work on model minimization in computer-aided soft...
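A minimal sketch of the homogeneity test this refers to, over assumed explicit transition and reward tables: a partition is ε-homogeneous when, within every block, all states agree up to ε on their immediate rewards and on the probability of transitioning into each block, so the blocks can serve as the states of a reduced MDP.

```python
def block_probability(s, a, block, T):
    """Probability of moving from s under action a into any state of `block`."""
    return sum(p for p, s2 in T[s][a] if s2 in block)

def is_eps_homogeneous(partition, actions, T, R, eps):
    """Check that the states grouped in each block behave the same up to eps."""
    for block in partition:
        for a in actions:
            rewards = [R[s][a] for s in block]
            if max(rewards) - min(rewards) > eps:
                return False
            for target in partition:
                probs = [block_probability(s, a, target, T) for s in block]
                if max(probs) - min(probs) > eps:
                    return False
    return True

# Tiny example: states "a" and "b" behave alike, so grouping them passes.
T = {"a": {"go": [(1.0, "c")]}, "b": {"go": [(1.0, "c")]}, "c": {"go": [(1.0, "c")]}}
R = {"a": {"go": 0.0}, "b": {"go": 0.05}, "c": {"go": 1.0}}
print(is_eps_homogeneous([{"a", "b"}, {"c"}], ["go"], T, R, eps=0.1))  # True
```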